Informatics in Education, 2015, Vol. 14, No. 2, 175-197 
©2015 Vilnius University 
DOI: 10.15388/infedu.2015.11 




Testing Algorithmic Skills in Traditional and 
Non-Traditional Programming Environments 


Maria CSERNOCH 1 , Piroska BIRO 1 , 

Janos MATH 2 , Kalman ABARI 2 

1 University of Debrecen, Faculty of Informatics, Hungary 

2 University of Debrecen, Faculty of Arts and Humanities, Hungary 
e-mail: {csernoch.maria, biro.piroska}@inf.unideb.hu,
{math.janos, abari.kalman}@arts.unideb.hu

Received: February 2015 


Abstract. The Testing Algorithmic and Application Skills (TAaAS) project was launched in the
2011/2012 academic year to test first-year students of Informatics, focusing on their algorithmic
skills in traditional and non-traditional programming environments, and on the transfer
of their knowledge of Informatics from secondary to tertiary education. The results of the tests
clearly show that students start their studies in Informatics with underdeveloped algorithmic skills,
with only very few of them reaching the level of extended abstract. To find the reasons for these
figures we analyzed the students’ problem-solving approaches. It was found that the students almost
exclusively consider traditional programming environments appropriate for developing
computational thinking and algorithmic skills. Furthermore, they do not apply concept-based and
algorithmic-based methods in non-traditional computer-related activities and, as such, mainly carry
out ineffective surface approach methods, as practiced in primary and secondary education. This would
explain the gap between the expectations of tertiary education, the students’ results in the school
leaving exams, and their overestimation of their knowledge, all of which lead to the extremely
high attrition rates in Informatics.

Keywords: algorithmic skills, spreadsheet, deep and surface metacognitive approaches,
self-assessment, school leaving exams.


1. Introduction 

That the computer has become ubiquitous is not in question. The question is how effectively
we can use it. This simple question, however, starts an avalanche of other questions.
Wing stated that “Computational thinking is a fundamental skill for everyone, not
just for computer scientists. To reading, writing, and arithmetic, we should add computational
thinking to every child’s analytical ability.” (Wing, 2006). Most school curricula
have been changed in the last two decades to support this approach by emphasizing the
importance of the development of digital literacy and competency. In most countries
these competences have been integrated into traditional school subjects and/or a new
subject was introduced. This latter alternative is effectively the equivalent of formal education
in CSI. However, formal education requires teachers, and teachers require teacher
education. At this point the loop is closed, and we are faced with the chicken and the egg
problem: who teaches the teachers if there are no teachers? In CSI education this is one
of the most crucial questions and for an answer we have to look back in time to the emergence
of the subject. The contradictions, both in the science itself and in the developing
commercial world as it has interacted with the science, affect teachers, teacher education
and consequently the development of digital competency and literacy. What should we
teach and how should we teach CSI in and outside of formal education?

The very first question is: “Should we teach students to program?” In 1993 Soloway
claimed that “...those students who wish to major in computer science should learn to
program in school, but for them, learning to program is simply vocational training.”
(Soloway, 1993). However, he did not make it clear which school(s) should be responsible
for teaching programming. He and his colleagues went further, claiming that everyone
should develop algorithmic skills, but non-traditional programming environments
would fit non-professionals better. Twenty years later, however, we are faced with the
problem that non-professional end-users mainly do unplanned, aimless clicking in the
GUI and are satisfied with unchecked, non-debugged results (Ben-Ari, 1999; Csernoch
and Biro, 2014a, 2014b; EuSprig, 2015; OECD, 2011; Panko and Aurigemma, 2010;
Powell et al., 2008; Tort et al., 2008; Van Deursen and Van Dijk, 2012). This behavior
leads to the extremely high number of documents and programs carrying mistakes and
errors, causing serious financial losses (Panko and Aurigemma, 2010). The other source
of losses is the human factor: the time and the number of participants needed to produce
this questionable result (Van Deursen and Van Dijk, 2012). However, what do we know
about the professionals, about those who, according to Soloway (1993), are supposed
to learn programming as their vocational training? Is their level of digital competency
higher than the level of the non-professionals? Are they prepared for high level programming
when they start their tertiary education in Computer Sciences? Do they know
what Computer Sciences are when they enter universities and colleges? We doubt it. We
have come to realize that the terminology usage and the algorithmic skills of students
arriving at the Faculty of Informatics, especially since the number of students in Informatics
has increased, have not developed to match the requirements of courses in
higher education.


2. Sample 

2.1. The Tests of the TAaAS Project 

However, neither the realization of the students’ underdeveloped algorithmic skills, nor
the high percentage of dropout CSI students, nor the high number of semesters the students
spend in their CSI studies (Csernoch and Biro, 2013b; Tan and Venables, 2010) can
indicate exactly what the students do or do not know when they start their tertiary education.
To see clearly what knowledge the students have brought with them, we launched
the TAaAS project (Testing Algorithmic and Application Skills) in the 2011/2012 academic
year at the Faculty of Informatics of the University of Debrecen, Hungary (Biro
et al., 2014, 2015; Biro and Csernoch, 2013a, 2013b, 2013c, 2014; Csernoch et al., 2014;
Csernoch and Biro, 2013a, 2013b, 2013c, 2014a, 2014b, 2014c).

The TAaAS project is a paper based testing process, which focuses on the level of the 
participants’ algorithmic skills, their usage of terminology, and their problem-solving 
abilities in different software environments. It runs on different levels. Each level has its 
own purposes, but they meet at the end. 

• Introductory test: Testing BSc, BA, MSc, MS students and student teachers of
Informatics in the first week of their first semester.

• CAAD test: Testing students’ spreadsheet knowledge after covering spreadsheet 
with a deep approach metacognitive method, entitled the Computer Algorithmic 
and Debugging based approach (CAAD) (Csernoch and Biro, 2015b). 

The primary aims of the testing are to reveal the students’ level of understanding, the 
connection between the similar tasks presented in the different software environments, 
the students’ level of terminology usage, and their level of computational thinking. 

2.1.1. Introductory Test 

The date of the introductory test is strictly set to the first week of the students’ tertiary 
course in order to test the knowledge they have gained in their previous studies, in such 
a way as not to be influenced by the curriculum of the new institute. The introductory 
test includes two questionnaires: 

• General information, attitude and self-assessment questionnaire. This takes about 
10-15 minutes, and then the papers are collected. 

• Informatics questionnaire, which is the primary test. This takes about 45 minutes,
with eleven tasks on traditional programming, spreadsheet programming, word
processing, handling files, and calculation in different numeral systems.

The general paper includes questions relating to the students’ computer usage habits, 
their results in the school leaving exams in Mathematics and Informatics (SLE, 2014), 
in competitions testing Informatics, and ECDL (2014), if they have taken part in any. 
The other sets of questions focus on the students’ self-assessment: how they evaluate 
their knowledge in the different subfields of Informatics, and what further studies they 
think they need. With the third group of questions we are testing the students’ approach 
towards spreadsheet programming, which is closely related to our CAAD test. 

2.1.2. CAAD Test 

It was realized that the metacognitive problem solving approaches involved in computer
related activities have never been mapped. We have found the problem solving approaches
used in other sciences (Case and Gunstone, 2002) and in programming (Booth,
1992), but not a complete typology to cover all computer related activities. To fill this
gap, we have merged the previously published problem solving typologies and made
the necessary amendments (Csernoch and Biro, 2015c). The result is a typology which
consists of two deep approach methods - concept based (Polya, 1954; Booth, 1992; Case
and Gunstone, 2002) and Computer Algorithmic and Debugging (CAAD) based (Booth,
1992; Csernoch and Biro, 2015b) - and three surface approach methods - algorithmic
based (Booth, 1992; Case and Gunstone, 2002), information based (Booth, 1992; Case
and Gunstone, 2002), and Trial-And-Error Wizard (TAEW) based (Csernoch and Biro,
2015b). The last of these covers all the unplanned sequences of activities which are carried out
in the GUI (Graphical User Interface), without the users necessarily knowing whether
the output is the result of the original problem or not.

While building her typology for programming approaches, Booth showed as early as
the early 1990s that functional languages are effective as introductory languages (Booth,
1992). The simplicity of these languages allows us to focus on the problem, instead
of the coding details. However, these findings have never reached the wider public. In
the meantime, based on similar theoretical backgrounds, a novel tool, the spreadsheet
program, appeared on the market. However, the programmability of spreadsheets has never
reached the wider public, either. These two ends - programming in functional languages
and a user-friendly tool for coding - have met in Sprego (Csernoch, 2012, 2014;
Csernoch and Biro, 2015a, 2015c; Csernoch and Balogh, 2011).

We have introduced Sprego, which is a deep approach metacognitive method to teach
building algorithms in a spreadsheet environment. The core of the method is that the
spreadsheet is taught and handled as a programming language (Sestoft, 2010), which would
serve either as an introductory language for professionals or as the ultimate language
for end-user programmers (Biro and Csernoch, 2013a; Csernoch, 2012; Csernoch and
Balogh, 2011; Warren, 2004). The method focuses on the development of algorithmic
skills. As such it breaks with the traditional, but ineffective, TAEW-based
(Trial-And-Error Wizard-based) methods in spreadsheets (Csernoch and Biro, 2014b). Its main idea
is that for novices we introduce as few and as simple functions as possible, and teach
how to create multilevel functions using these simple functions to solve problems. With
the development of the students’ skills the number of functions can be increased, but the
focus is still on simple, not software-specific, functions.

With the CAAD-test our goal was to reveal how the students’ deep approach problem 
solving abilities are developed in non-traditional programming environments. 

2.1.3. Data Sources of the TAaAS Project 

In the evaluation phase of the TAaAS project we focus on comparing the results of the
tasks presented in different programming environments, and on the results of the different
sources and testing methods. The statistical analyses of the data deriving from the
different sources might provide explanations for the unsatisfactory level of our students,
and guidelines for improving the effectiveness and efficiency of our primary and secondary
CSI education.

Our further aim is to widen the project, and to test students from different institutes 
and countries to see the similarities and differences, to reveal which education systems 
best support the development of algorithmic skills and computational thinking and which 
approaches should be adapted to make the other systems work effectively.

We do not share those extreme opinions which blame birotical (computer-related office
tools) software (Gove, 2012, 2014) for any distracting effect, and consequently, for
making CSI education ineffective. We argue that these programs are harmless; the
commercialized world developed on the basis of these programs and teachers’ unconditional
acceptance of the TAEW-based approaches have led us to our current fiasco. We further
argue that, since programs are algorithm driven, if we taught any software from
an algorithmic point of view, we would be much more successful, and students’
computational thinking would develop, which is the ultimate goal.


2.2. Participating Students 

The TAaAS project was launched with three major programming BSc courses at the Faculty
of Informatics of the University of Debrecen (DE): Software Engineering (SOE),
System Engineering (SYE), and Business Information Management (BIM), and was repeated
in the following years under similar conditions, with altogether 950 DE students
taking part (Table 1). The project was extended in the 2013/2014 academic year, when
three more Hungarian institutes joined: Eotvos Lorand University (ELTE, Budapest),
Eszterhazy Karoly College (EKF, Eger), and the College of Nyiregyhaza (NYF,
Nyiregyhaza) (Biro et al., 2014; Csernoch et al., 2014; Csernoch and Biro, 2013b).

The school leaving exams are held at intermediate and advanced levels, and to enter
tertiary CSI education students can choose which level they take (SLE, n.d.). Mathematics
is compulsory, while Informatics is not, which explains the lower number of exams
in Informatics. Due to some students from foreign countries and some uncompleted
questionnaires, the number of students taking the Mathematics exams is lower than the
expected 950 (Table 2).

We have to note here that in Hungary there has been formal CSI education both in 
primary and secondary education since the 1995 National Curricula was launched (NAT, 
1995, 2003, 2007, 2012; European Schoolnet, n.d.). According to these documents, CSI 
studies should start as early as the 1st grade and continue until the 12th grade. However, 
the realization of the National Curricula, in the form of the frame curricula (Kerettanterv 
2000, 2009, 2013), shows different results, and these CSI classes do not provide enough 
space and time for the development of algorithmic skills in traditional programming 
environments. 


Table 1

The number of students at the Faculty of Informatics of the University of
Debrecen participating in the TAaAS project in the three testing years

             SOE    SYE    BIM    Sum
2011/2012    115     86    109    310
2012/2013    108    111    101    320
2013/2014    115    115     90    320
Sum          338    312    300    950


Table 2

The students’ results in the school leaving exams in Informatics and in Mathematics

                 SOE        SYE        BIM        Average

School leaving exam (SLE) - intermediate level
Informatics      84.1       82.2       80.8       82.3
                 (N=175)    (N=237)    (N=193)    (N=605)
Mathematics      74.1       71.4       74.4       73.3
                 (N=277)    (N=265)    (N=252)    (N=794)

School leaving exam (SLE) - advanced level
Informatics      72.5       66.4       55.7       69.7
                 (N=127)    (N=37)     (N=16)     (N=180)
Mathematics      68.1       70.3       68.9       69.0
                 (N=22)     (N=17)     (N=24)     (N=63)
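
The Average column of Table 2 appears to be the mean of the three majors weighted by the number of students; this is an observation inferred from the printed figures, not something stated in the text. The Python sketch below recomputes the column under that assumption. The names `TABLE2` and `weighted_mean` are introduced here for illustration only.

```python
# Per-major (mean score, N) in the order SOE, SYE, BIM, followed by the printed Average.
TABLE2 = {
    "Informatics, intermediate": ([(84.1, 175), (82.2, 237), (80.8, 193)], 82.3),
    "Mathematics, intermediate": ([(74.1, 277), (71.4, 265), (74.4, 252)], 73.3),
    "Informatics, advanced": ([(72.5, 127), (66.4, 37), (55.7, 16)], 69.7),
    "Mathematics, advanced": ([(68.1, 22), (70.3, 17), (68.9, 24)], 69.0),
}


def weighted_mean(groups):
    """Mean of the group means, weighted by the number of students in each group."""
    total_n = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_n


for label, (groups, printed) in TABLE2.items():
    print(f"{label}: weighted {weighted_mean(groups):.2f}, printed {printed}")
```

Because the per-major means are themselves rounded to one decimal, the recomputed values can only be expected to match the printed averages to within about 0.1.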


The results of the school leaving exams in Informatics were found to be significantly
higher than in Mathematics in all three majors (Wilcoxon signed rank tests: p < 0.001),
both at intermediate and advanced levels. However, the results of the school leaving
exams do not show significant differences between the three majors in Mathematics
(Kruskal-Wallis test: $\chi^2(2) = 5.802$, p = 0.055). In Informatics significant differences
were found between the three majors (intermediate: $\chi^2(2) = 8.451$, p = 0.015; advanced:
$\chi^2(2) = 14.41$, p < 0.001), while the pairwise comparisons found that only the SOE students’
results differed from those of the BIM students. Given the low number of students taking
the advanced level Mathematics exam, we cannot say anything about the differences, but
their results seem very similar (Table 2).

We can conclude that according to the results in the school leaving exams, the students
start their tertiary education in CSI with similar knowledge, both in Informatics
and in Mathematics, while their problem solving skills are on a higher level in Informatics
than in Mathematics. Based on these data, we can assume that students would
produce similarly good results in the test.


2.3. The Tasks of the TAaAS Project 

In the present article we compare the results of three of the algorithmic tasks of the 
TAaAS tests. The tasks are different in nature. 

• Task 1: An algorithm with a multilevel if structure, which takes an X and Y pair as
input, with three possible input values (A, B and 0) and only four possible
output values (3, 2, 1 and 0). A table is presented with 9 pairs of inputs, and the
output cells must be filled in. Previous analyses proved that this is the easiest task
among the three (Fig. 1) (Biro et al., 2014, 2015).

• Task 2: Three pseudo codes and an accompanying picture with the input values
presented. The task is to tell what the codes do (Fig. 2).

You draw two cards (X, Y) from two packs. In both packs you can find cards
with the letter A, cards with the letter B and cards with zero. Give the points
in the last column of the table according to the algorithm given below!

V:=X="A" or Y="A"
W:=X="B" or Y="B"

If V and W then Point:=0
else If V then Point:=1
else If W then Point:=2
else Point:=3

X    Y    Point
A    A
A    B
A    0
B    A
B    B
B    0
0    A
0    B
0    0



Fig. 1. The code, the input values and the empty cells for the outputs of Task 1. 
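
The algorithm of Fig. 1 can be transcribed almost line by line into a conventional programming language. The following Python sketch is an illustration only and was not part of the paper based test; the names `point` and `PAIRS` are hypothetical. It fills in the Point column for the nine input pairs in the order of the table.

```python
def point(x: str, y: str) -> int:
    """Return the points for one (X, Y) draw, following the pseudo code of Fig. 1."""
    v = x == "A" or y == "A"  # at least one card shows the letter A
    w = x == "B" or y == "B"  # at least one card shows the letter B
    if v and w:
        return 0
    elif v:
        return 1
    elif w:
        return 2
    else:
        return 3


# The nine input pairs in the order of the table in Fig. 1.
PAIRS = [(x, y) for x in "AB0" for y in "AB0"]

for x, y in PAIRS:
    print(x, y, point(x, y))
```

The order of the branches matters: the combined condition is tested first, so a pair containing both letters never reaches the single-letter branches.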


What do the following programs do? What is stored in variable DB and M?

N=50, measured values:
500,500,500,500,500,600,
600,650,700,750,820,880,
930,1010,1050,980,930,830,
780,720,720,710,700,750,
770,790,820,880,880,820,
760,740,600,500,560,670,
780,820,920,880,860,820,
770,770,760,750,740,740,
730,720.

Task 2.1

DB:=0
Loop from i=1 to N
    If X(i)>800 then DB:=DB+1
End loop

Task 2.2

DB:=0
Loop from i=2 to N-1
    If X(i)<X(i-1) and X(i)<X(i+1) then DB:=DB+1
End loop

Task 2.3

M:=0
Loop from i=2 to N
    If X(i)-X(i-1)>M then M:=X(i)-X(i-1)
End loop


Fig. 2. The accompanying picture, the input data and the pseudo codes of Task 2. 
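
For readers who wish to trace the three pseudo codes of Fig. 2, a minimal Python sketch follows. It is an illustration added under the assumption that the 50 measured values are read row by row as printed; the function names are hypothetical, and the 1-based loops of the pseudo codes are shifted to Python's 0-based indexing.

```python
# Measured values from Fig. 2, read row by row (N = 50).
X = [500, 500, 500, 500, 500, 600,
     600, 650, 700, 750, 820, 880,
     930, 1010, 1050, 980, 930, 830,
     780, 720, 720, 710, 700, 750,
     770, 790, 820, 880, 880, 820,
     760, 740, 600, 500, 560, 670,
     780, 820, 920, 880, 860, 820,
     770, 770, 760, 750, 740, 740,
     730, 720]
N = len(X)


def count_above(values, limit=800):
    """Task 2.1: count the values greater than the limit."""
    return sum(1 for v in values if v > limit)


def count_local_minima(values):
    """Task 2.2: count inner values strictly smaller than both neighbours."""
    return sum(1 for i in range(1, len(values) - 1)
               if values[i] < values[i - 1] and values[i] < values[i + 1])


def largest_rise(values):
    """Task 2.3: largest increase between two consecutive values (0 if none)."""
    m = 0
    for i in range(1, len(values)):
        if values[i] - values[i - 1] > m:
            m = values[i] - values[i - 1]
    return m


print(N, count_above(X), count_local_minima(X), largest_rise(X))
```

In words, DB in Task 2.1 counts the measurements above 800, DB in Task 2.2 counts the strict local minima, and M in Task 2.3 stores the largest increase between two consecutive measurements.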


• Task 3: Programmable spreadsheet problems accompanied with a sample table
(Fig. 3). The three tasks to solve with formulas (four in 2013/2014) and a
multilevel spreadsheet formula to decode are similar to Task 2.

Both Tasks 1 and 2 are borrowed from a Hungarian programming competition for
students of 5th-8th grades (NT, 2009).

To evaluate Task 2 the different categories of understanding of the SOLO taxonomy 
were adapted to the special circumstances of the test: 

• Ignored (1). 

• Prestructural (2). 

• Unistructural (3). 

• Multistructural (4). 

• Relational (5). 

The Extended abstract category of the original classification is left out, since it has
no relevance in the test (Biggs and Collis, 1982; Biro and Csernoch, 2014; Clear et al.,
2008; Lister et al., 2006; Sheard et al., 2008).

When handling spreadsheet as an environment for programming in functional languages,
we were able to adapt the SOLO categories to the spreadsheet solutions (Biro
and Csernoch, 2014). Using this method we can compare traditional and non-traditional
programming solutions, the students’ level of understanding, and their approaches to
problem-solving in different environments.

       A                 B            C                   D          E
1      Country           Continent    Capital             Area       Population (thousand)
2      Afghanistan       Asia         Kabul               647500     27756
3      Albania           Europe       Tirana              28748      3545
4      Algeria           Africa       Algiers             2381740    32278
5      American Samoa    Oceania      Pago Pago           199        69
6      Andorra           Europe       Andorra la Vella    468        68
7      Angola            Africa       Luanda              1246700    10593
8      Anguilla          America      The Valley          102        12
...
233    Yemen             Asia         Sanaa               527970     18701
234    Yugoslavia        Europe       Belgrade            102350     10657
235    Zambia            Africa       Lusaka              752614     9959
236    Zimbabwe          Africa       Harare              390580     11377

Task 3.1 How many African countries are in the table? 

Task 3.2 What is the average population of those countries whose surface area is 
smaller than G1? 

Task 3.3 How many countries have a surface area greater than G1? 

Task 3.4 {=SUM(IF(B2:B236="Europe",IF(LEFT(A2:A236)="A",1)))}


Fig. 3. The sample table and problems of Task 3. 
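
The four spreadsheet problems of Fig. 3 are all conditional aggregations over the rows of the table, which the following Python sketch makes explicit. It is an illustration, not the solution expected in the test: it uses only the eleven rows visible in the figure, and the value of `G1` is a hypothetical threshold, since the content of cell G1 is not given.

```python
# The eleven rows visible in Fig. 3:
# (country, continent, capital, area, population in thousands).
ROWS = [
    ("Afghanistan", "Asia", "Kabul", 647500, 27756),
    ("Albania", "Europe", "Tirana", 28748, 3545),
    ("Algeria", "Africa", "Algiers", 2381740, 32278),
    ("American Samoa", "Oceania", "Pago Pago", 199, 69),
    ("Andorra", "Europe", "Andorra la Vella", 468, 68),
    ("Angola", "Africa", "Luanda", 1246700, 10593),
    ("Anguilla", "America", "The Valley", 102, 12),
    ("Yemen", "Asia", "Sanaa", 527970, 18701),
    ("Yugoslavia", "Europe", "Belgrade", 102350, 10657),
    ("Zambia", "Africa", "Lusaka", 752614, 9959),
    ("Zimbabwe", "Africa", "Harare", 390580, 11377),
]

G1 = 100_000  # hypothetical threshold; the real content of cell G1 is unknown

# Task 3.1: number of African countries.
african = sum(1 for r in ROWS if r[1] == "Africa")

# Task 3.2: average population of the countries with an area smaller than G1.
small = [r[4] for r in ROWS if r[3] < G1]
avg_small = sum(small) / len(small) if small else None

# Task 3.3: number of countries with an area greater than G1.
large = sum(1 for r in ROWS if r[3] > G1)

# Task 3.4: what the array formula counts, i.e. the European countries
# whose name starts with the letter "A".
european_a = sum(1 for r in ROWS if r[1] == "Europe" and r[0][:1] == "A")

print(african, avg_small, large, european_a)
```

The sketch shows that Task 3.4 has the same shape as Tasks 3.1 and 3.3, a count under a condition; only the condition differs.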


3. Hypotheses 


H1: Students of Informatics start their tertiary education with well-developed
algorithmic skills and computational thinking, with a
firm concept of computers, and with correct terminology usage.

H2: The results of the school leaving exams are able to tell apart the 
different levels of knowledge, and serve as a correct measure of 
algorithmic skills. 

H3: The students of Informatics have the knowledge to provide correct 
self-assessment values, indicating their knowledge in the different 
fields of Informatics. 

H4: The students’ algorithmic skills are software independent. 


4. Results 

4.1. Classification of Students 

4.1.1. Preliminary Groups 

The test results and the self-assessment values of the three majors at the University of 
Debrecen (SOE, SYE, BIM - preliminary groups) are presented in Fig. 4-Fig. 6. 

The students’ self-assessment values are higher in spreadsheet than in programming
in all the three preliminary groups; however, their results in the test proved exactly the
opposite (Fig. 4). Task 1 is found to be the easiest among the three different kinds of
tasks, and Task 3 the most difficult. BIM is the only group whose spreadsheet result is
slightly above the decoding tasks; in the other groups the results of both Tasks 1 and 2
are better than Task 3.


Fig. 4. The students’ results in the three tasks compared to their self-assessment
values in programming (S-A-P) and spreadsheet (S-A-S).


Fig. 5. The students’ results in Tasks 1 and 2 compared to their programming
self-assessment values (S-A-P).

It is clear from Fig. 4 and Fig. 5 that the students’ results in Task 1 are above
both their programming self-assessment values and the results of the other two tasks. The
dominance of Task 1 over Task 2 is the straightforward consequence of the requirements
of the tasks. In Task 1 the students had to do nothing other than pick one out of the four
numbers, while in Task 2 semantically correct natural language sentences had to be written
after decoding the pseudo codes. The spreadsheet problems were found more difficult
than the students predicted (Fig. 4). However, if we study the most recently published
papers dealing with the high number of spreadsheet documents with errors (EuSprig,
2015; Panko and Aurigemma, 2010; Powell et al., 2008; Tort et al., 2008), and the undue
length of time spent creating birotical documents (Van Deursen and Van Dijk, 2012), we
can conclude that the results of the test are not surprising at all. Beyond finding evidence
in these papers for the incorrectness of the spreadsheet documents, it is also covertly
evident that end-users either feel so comfortable about their knowledge or are so ignorant
of their non-existing knowledge that it does not occur to them to check their documents,
which is the well-known Dunning-Kruger effect (Kruger and Dunning, 1999).


Fig. 6. The students’ results in the spreadsheet task compared to
their spreadsheet self-assessment values (S-A-S).

Fig. 6 clearly shows that the students’ spreadsheet self-assessment values (S-A-S)
greatly exceed their results in the test. Beyond this fact, Task 3.1 shows similar results
to Task 3.4, while Task 3.2 is similar to Task 3.3 in the three majors (Wilcoxon signed rank tests:
Tasks 3.1 and 3.4, SOE, SYE, BIM: p = 0.3168, p = 0.4699, p = 0.4635; Tasks 3.2 and
3.3: p = 0.02531, p = 0.2642, p = 0.3363, respectively). The only significant difference
within a pair was found in the SOE group between Tasks 3.2 and 3.3. The differences between the
two pairs are significant (Wilcoxon signed rank tests: pairs Tasks 3.1-3.4 and Tasks 3.2-3.3:
SOE, SYE, BIM: p < 0.001). The differences between these two pairs can be explained
by the nature of the tasks. Task 3.1 is a simplified task using a constant and checking
equality in the condition. Task 3.4 is a complete array formula for decoding, which is
easier than creating formulas. On the other hand, both Tasks 3.2 and 3.3 are generalizations
of Task 3.1.

The other characteristic of Task 3.4 is that it was included with the purpose that it
would serve as a guideline for solving Tasks 3.1-3.3. However, the analyses of the solutions
proved that the students did not realize the connection between these four tasks
(Csernoch and Biro, 2014b). Students learn rules which work only for special problems,
and this is, as such, a surface approach method, referred to as the algorithmic-based
problem solving method in the Case & Gunstone system (Case and Gunstone, 2002)
and as expedient in Booth’s system (Booth, 1992). We must note here that the surface approach
methods referred to as algorithmic-based and expedient - in Case & Gunstone’s
and Booth’s typologies, respectively - are different from our computer-algorithmic and
debugging (CAAD) deep approach methods (Csernoch and Biro, 2014b; Csernoch and
Biro, 2015c). Consequently, in this analysis we found that students are not able to do abstraction
- their level of understanding is extremely low (for details see section 4.2.2).


4.1.2. Knowledge-Based Clusters in Task 1 

Since Task 1 was found to be the most successful among the three tasks, it was used to 
define knowledge-based clusters. Four clusters were easily distinguishable; C1L, C2L, 
C3L and C4L, ranging from the best results to the worst, respectively. Those students 
who did not try or complete Task 1 are placed in C4L. 

The result of C2L is closer to C1L, while C3L is closer to C4L. However, the comparison
between the results from C2L and C3L revealed a remarkable difference between
the two middle level clusters. Their knowledge is different in nature. As we show
in Fig. 7, C2L indicates an average, uncertain knowledge, without any definable pattern.
Their knowledge is quite arbitrary. However, C3L is different in nature. In certain cases
they do almost as well as C1L, while in other cases their results are disastrous. This behavior
of C3L led us to the conclusion that these students have limited knowledge. Until
they reach their limit they are able to solve the problems almost perfectly; however, beyond
their limit they try to find escape routes, and this strategy leads them in completely
false directions. In the case of Task 1, C3L did well when the input values of the X, Y
pairs were A and B (Fig. 7, #1, 2, 4, 5); however, when one of the input values was 0 they
concluded that the output should also be 0 (Fig. 7, #3, 6, 7, 8, 9).
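
The C3L pattern described above can be stated as a simple rule and compared with the algorithm of Task 1. The Python sketch below is a deliberately simplified, hypothetical model of that pattern (any 0 among the inputs yields 0, otherwise the correct algorithm is followed); it is an assumption made for illustration, not a description of how individual students actually reasoned.

```python
def correct_point(x, y):
    """Task 1 algorithm as given in the test."""
    v = x == "A" or y == "A"
    w = x == "B" or y == "B"
    if v and w:
        return 0
    if v:
        return 1
    if w:
        return 2
    return 3


def c3l_point(x, y):
    """Hypothetical model of the C3L answers: any 0 input yields output 0."""
    if "0" in (x, y):
        return 0
    return correct_point(x, y)


# The nine input pairs in the order of the table of Task 1.
pairs = [(x, y) for x in "AB0" for y in "AB0"]

# 1-based row numbers where the modelled answer differs from the correct one.
wrong_rows = [i for i, (x, y) in enumerate(pairs, start=1)
              if c3l_point(x, y) != correct_point(x, y)]
print(wrong_rows)
```

Under this model the errors fall exactly on the rows in which one of the inputs is 0, because the correct output is never 0 in those rows.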

Similar results were found when comparing the different Hungarian institutes in the
2013/2014 academic year (for details see Biro et al., 2014). The same four clusters with
the very same characteristics were found in the other institutes, which clearly shows that
the results are not specific to the University of Debrecen, but rather a general tendency.
To obtain further data we are planning to test other Hungarian and foreign institutes in
the following years.


Fig. 7. The results of the three clusters in Task 1.


4.2. Comparison of the Students’ Results

4.2.1. Results of Task 2 

The results of the four knowledge based clusters in Task 2 are presented in Table 3, and
the behavior of the different clusters can be tracked in Fig. 8. The results for the clusters,
similar to the results in Task 1, follow a decreasing order in C1L, C2L, C3L and C4L,
respectively. The behavior of C3L, which can be recognized in Task 1, is also detectable in
Task 2: while C2L reaches Level 3 with the highest percentage in all three tasks,
C3L stops at Level 1. This is the only level where C3L is above C2L. The knowledge of
C3L in these decoding tasks, which were set for advanced primary school students, is
extremely limited.

4.2.2. Results of Task 3 

It is clearly evident, both for the preliminary groups and for the C1L-C4L knowledge 
based clusters, that the students’ results in spreadsheet are significantly lower than their 
self-evaluation values (Wilcoxon signed rank tests: p < 0.001). 

The spreadsheet problems of the test proved to be more difficult than the decoding of
the pseudo codes of Task 2. However, the result for the decoding of the array formula of
Task 3.4 is similar to the decoding of the traditional pseudo codes (Table 3).


Fig. 8. The students’ levels of understanding using the categories of the SOLO-taxonomy.


Table 3

The results of the four clusters in Tasks 2 and 3 based on the logical task (Task 1),
and the self-assessment values in programming (S-A-P) and spreadsheet (S-A-S)

                C1L     C2L     C3L     C4L
N               279     205     210     256
S-A-P (%)       46.9    35.9    32.4    23.0
Task 1 (%)      99.8    70.3    51.2    0
Task 2.1 (%)    57.3    39.1    37.6    18.7
Task 2.2 (%)    46.1    23.7    20.8    9.5
Task 2.3 (%)    38.8    21.7    17.6    7.7
S-A-S (%)       75.3    72.4    74.5    71.5
Task 3.1 (%)    38.5    28.8    26.1    24.7
Task 3.2 (%)    13.5    8.9     9.1     7.4
Task 3.3 (%)    19.3    15.7    11.5    11.8
Task 3.4 (%)    41.0    29.3    32.4    23.4


Using the same knowledge based clusters, C1L-C4L, the results of Task 3 were analyzed
(Table 3). The differences between the clusters are significant, except for Task 3.3
(Kruskal-Wallis: Task 3.1: p < 0.001, Task 3.2: p = 0.034, Task 3.3: p = 0.3804, Task 3.4:
p < 0.001). However, when comparing the different pairs of clusters the differences are
not significant. In fact, only a couple of pairs were found with significant differences:
in Task 3.1 pairs C1L-C3L and C1L-C4L, in Tasks 3.2 and 3.3 none, in Task 3.4 pair
C1L-C4L.

The students were not able to recognize the similarities between Tasks 3.1-3.3 and
Task 3.4, and were not able to copy the solution of Task 3.4 to their formulas, but they
were able to decode it more easily than the traditional loops. This finding strengthens our
previously published results, which showed spreadsheet could be used as an introductory
programming language using CAAD-based deep approach metacognitive methods
(Biro and Csernoch, 2013a; Csernoch and Biro, 2014b).

We have to confront the unfortunate situation that the clusters which worked well for
the categorization of the students in the programming task do not work in the spreadsheet
task; they are not able to distinguish between the different levels of the students’
spreadsheet knowledge. This means that the students’ spreadsheet knowledge is not
connected to their ability to solve logical or programming tasks.

Considering these results, we selected another spreadsheet task in the test in order to
create new knowledge-based clusters: ‘What is the capital city of the largest country?’
(Task 3.0). This spreadsheet task requires both a knowledge of basic spreadsheet
functions - index(), match() and max() - and the ability to handle multilevel functions,
and as such it is related to Mathematics, to programming and to spreadsheets.
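
The multilevel composition that Task 3.0 asks for can be sketched outside the spreadsheet
as well. The snippet below is an illustration only: the three-row table, the reading of
‘largest’ as largest by area, and the helper names `match` and `index` (written to mimic
the 1-based spreadsheet functions with exact matching) are assumptions, not the data or
the marking scheme of the test. In a spreadsheet the same idea would typically be written
as a nested formula of the form INDEX(capitals, MATCH(MAX(areas), areas, 0)).

```python
# Illustrative table; areas are approximate values in square kilometres.
countries = ["Hungary", "Canada", "Brazil"]
capitals = ["Budapest", "Ottawa", "Brasília"]
areas_km2 = [93_030, 9_984_670, 8_515_767]


def match(value, column):
    """Mimic MATCH(value, column, 0): 1-based position of the first exact match."""
    return column.index(value) + 1


def index(column, position):
    """Mimic INDEX(column, position) with a 1-based position."""
    return column[position - 1]


# Innermost level: the largest area; middle level: its row; outer level: the capital.
largest_area = max(areas_km2)
row = match(largest_area, areas_km2)
capital_of_largest = index(capitals, row)

if __name__ == "__main__":
    print(index(countries, row), capital_of_largest)
```

Reading the expression from the inside out (maximum, position, lookup) is the algorithmic
step that distinguishes a deliberate solution from trial and error with single functions.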

Three knowledge-based clusters were recognizable using Task 3.0: C1S, C2S and
C3S, moving from the best to the worst, respectively (Table 4, Fig. 9). Those students
who scored 0% in Task 3.0 make up C3S, and there is an extremely high number of these
students (452 students, 47.58%), compared to those in C4L (256 students, 26.95%).

Table 4

The results of the three clusters based on Task 3.0 and
their self-assessment values in spreadsheet (S-A-S)

                   C1S     C2S     C3S   Average
N                  185     313     452       950
S-A-S (%)         79.6    74.5    70.5      73.5
Task 3.0 (%)     47.64    9.78    0.00      12.5
Task 3.1 (%)     49.64   36.41   17.41     29.95
Task 3.2 (%)     17.67   15.22    2.98      9.87
Task 3.3 (%)     25.76   20.30    5.25     14.86
Task 3.4 (%)     47.57   37.91   21.17     31.82

Fig. 9. The results of the three S clusters in solving the spreadsheet tasks,
applying the SOLO categories of understanding.


This might explain why only two more clusters were distinguishable on the basis of 
Task 3.0. 

Similar results were found using the C1S-C3S clusters as with the C1L-C4L clusters.
There are significant differences between the results of the three clusters
(Kruskal-Wallis: p < 0.001); however, the comparison of the pairs shows hardly any
differences between them. We found the following pairs with significant differences:
Task 3.1 is the only problem where all the pairs are significantly different; in Tasks 3.2,
3.3 and 3.4 only pairs C1S-C3S and C2S-C3S differ, which means that only C3S is different
from the others, while we cannot deduce anything from the two better groups. We can
conclude that selecting a spreadsheet task as the basis for the knowledge-based clusters
proved no better than selecting a logical task (Task 1). These results further strengthen
our previously published results which showed that students solve spreadsheet problems
with surface approach methods, without supporting their solutions with algorithms.
For better understanding and better results in solving spreadsheet problems
CAAD-based deep approach methods should be applied. To obtain proof for this result
we applied the SOLO categories of understanding to the spreadsheet solutions (Biro
and Csernoch, 2014).

Fig. 9 presents the levels of understanding in the spreadsheet tasks of the different
S clusters. Unfortunately, in all the three tasks the most frequent level is 1, Ignored.
From now on we focus on the non-zero results. C3S stops at Level 1 in all the four
spreadsheet tasks. A couple of students in the cluster solved Task 3.4 at Level 5, and
some non-zero solutions were found for Task 3.1, but almost nothing for Tasks 3.2 and
3.3. In C2S and C3S in Task 3.1 the most frequent level of understanding is 3,
Unistructural understanding. The difference between the two clusters is that C2S stops
at this level, while C1S reaches Level 5. In Task 3.2, which is a threefold generalization
of Task 3.1, both C2S and C3S stop at Level 3, so they are at the Prestructural and
Unistructural levels. In C1S in Task 3.3, which is a twofold generalization of Task 3.1,
the most frequent level is 3, but we can find solutions at both Levels 4 and 5. On the
other hand, in C2S Level 2 is the most frequent, and the others follow in descending
order. In Task 3.4 the pattern is simple: as the cluster number increases, the level of
understanding decreases.

It is remarkable that Level 4 is the least frequent. This means that spreadsheet
knowledge, unlike traditional programming, can hardly be categorized as Multistructural.
Students either know the solution or they do not. We argue that it should not be like this,
and that TAEW-based methods have led the students to this state. Beyond this, we have
proved that when teaching spreadsheet with a CAAD-based method the importance of
Level 4 increases (Csernoch and Biro, 2014b).

4.2.3. School Leaving Exams 

With the comparison of the students’ results in the test and in the school leaving exams 
we wanted to see what is gained or lost by not considering the Informatics school leaving 
exam as a compulsory requirement for entering tertiary studies in Informatics. 

Both the L and S knowledge-based clusters were used to compare the students’ results
in the test to their results in the school leaving exams. Table 5 and Figures 10 and 11
show that there are differences between the clusters in terms of the results of the school
leaving exams. The question was whether these differences are significant, and whether
the results of the school leaving exams are able to distinguish between the levels of the
students or not.

At advanced level Informatics only the C1L result was found to be significantly
higher than that of the other clusters (Kruskal-Wallis test: advanced: χ²(3) = 14.53,
p = 0.002; intermediate: χ²(3) = 8.7, p = 0.012). In the comparison of the pairs at
intermediate level C1L is different from the others, while at advanced level C1L can be
distinguished from both C2L and C3L.



Table 5

The L clusters’ results in the school leaving exams

                                                     C1L        C2L        C3L        C4L
School leaving exam (SLE) - intermediate level
  Informatics                                       84.1       80.6       82.1       82.2
                                                 (N=154)    (N=135)    (N=147)    (N=169)
  Mathematics                                       77.2       69.7       72.2       73.1
                                                 (N=225)    (N=177)    (N=179)    (N=213)
School leaving exam (SLE) - advanced level
  Informatics                                       74.6       65.1       65.1       66.3
                                                  (N=85)     (N=40)     (N=28)     (N=27)
  Mathematics                                       75.1       62.7       64.0       67.2
                                                  (N=24)     (N=10)     (N=10)     (N=19)






Fig. 10. The results of the C1L-C4L and the C1S-C2S knowledge-based clusters
in the school leaving exams in Informatics at intermediate and advanced levels.

Fig. 11. The results of the C1L-C4L and the C1S-C2S knowledge-based clusters
in the school leaving exams in Mathematics at intermediate and advanced levels.

Analyzing the connection between the S clusters and the results of the school leaving
exams in Informatics, both at intermediate and advanced level, significant differences
were found (Kruskal-Wallis test: advanced: χ²(2) = 17.88, p < 0.001; intermediate:
χ²(2) = 10.16, p < 0.001). Comparison of the pairs shows that C1S is different
from both C2S and C3S at both levels, but that there is no difference between C2S
and C3S.

The comparison of the students’ results in the test and in the Informatics school leaving
exams clearly shows that the school leaving exams, especially at the intermediate
level, are not able to distinguish between the different levels of knowledge. The best
students do better in these exams, while there are no significant differences between
the others’ results. The connection between the school leaving exams and the S clusters
tends to be stronger than with the L clusters, a result which led us to the conclusion that
school leaving exams in Informatics, especially at intermediate level, do not support
algorithmic approaches, and as such should not serve as a selector for entry to tertiary
CSI education.

Consequently, not making the Informatics school leaving exam a requirement for
enrolling on a course, and not using it as a measure to distinguish between students, is
not a great loss. In itself this is bad news, since it questions the reliability of this
exam; on the other hand, it seems a wise decision not to include it as a compulsory subject.

When comparing the results in the Mathematics school leaving exams and the clusters,
we must first of all note that a very low number of students take the advanced level
exams (Table 5). At intermediate level the school leaving exams show significant
differences in both the L and the S clusters (Kruskal-Wallis test: χ²(3) = 25.79,
p < 0.001; χ²(2) = 10.16, p < 0.001, respectively). When comparing the pairs in
Mathematics in the L clusters, C1L was found to be different from the other three clusters,
but no other pairs were found which differed from each other. In a similar way, in the
S clusters C1S is different from both C2S and C3S, but there is no difference between
C2S and C3S.

We can conclude that neither the Informatics nor the Mathematics school leaving
exam at intermediate level is able to tell apart the different levels of the students’
algorithmic skills and programming abilities. While the school leaving exams at
intermediate level were not able to distinguish between the different majors, the advanced
Informatics exam and our knowledge-based clusters performed better.

4.2.4. Clusters and the Majors 

We have seen that the school leaving exams were not able to distinguish between the
different majors (Table 2). However, the knowledge-based L clusters clearly show that
the SOE students are present in the first clusters in the highest percentage, while BIM
students appear in the lowest clusters in the highest percentage (Table 6). The BIM and
the SOE students move in opposite directions. The SYE students in the L clusters are
almost equally distributed (Pearson’s Chi-squared test: χ²(6) = 100.1, p < 0.001).

It is difficult to discover anything more about the connection between the majors and
the S clusters because of the extremely high number of Level 1 performances produced
by the students in the spreadsheet problems (Table 7): in all three majors the percentage
of the students who were not able to solve Task 3.0 is between 45 and 50% (C3S).

Table 6

The number of students in the C1L-C4L clusters compared to the majors

             C1L       C2L       C3L       C4L     Total
BIM           47        58        71       124       300
          15.68%    19.33%    23.67%    41.33%    31.58%
SYE           78        74        74        86       312
          25.00%    23.72%    23.72%    27.67%    32.84%
SOE          154        73        65        46       338
          45.56%    21.60%    19.23%    13.61%    35.58%
Total        279       205       210       256       950


Table 7

The number of students in the C1S-C3S clusters compared to the majors

             C1S       C2S       C3S     Total
BIM           40       111       149       300
          13.33%    37.00%    49.67%    31.58%
SYE           64       109       139       312
          20.51%    34.94%    44.55%    32.84%
SOE           81        93       164       338
          23.96%    27.51%    48.52%    35.58%
Total        185       313       452       950


The tendencies are the following: the percentage of the BIM students in the C1S cluster
is the lowest, while their percentage in C3S is the highest. The percentage of the SOE
students in C1S is the highest, while the SYE students lie between BIM and SOE. These
results are similar to those found with the L clusters.

However, more SYE than SOE students are classified as C2S (Pearson’s Chi-squared
test: χ²(4) = 15.32, p = 0.004). The differences using the S clusters are statistically
significant; however, these clusters are not able to distinguish the majors as well as the L
clusters.
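
The two Pearson’s Chi-squared statistics reported in this subsection can be recomputed
from the counts in Tables 6 and 7. The following is a minimal sketch using only the Python
standard library; the function and variable names are our own illustrative choices, and the
p value helper relies on the closed form of the chi-squared tail that holds for even
degrees of freedom only (which covers both tables here).

```python
from math import exp, factorial


def chi_squared_independence(table):
    """Pearson's chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


def chi2_tail_even_df(x, df):
    """Upper tail P(X > x) of the chi-squared distribution; valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form used here requires a positive even df")
    half = x / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


# Rows: BIM, SYE, SOE. Columns: C1L..C4L (Table 6) and C1S..C3S (Table 7).
l_clusters = [[47, 58, 71, 124], [78, 74, 74, 86], [154, 73, 65, 46]]
s_clusters = [[40, 111, 149], [64, 109, 139], [81, 93, 164]]

if __name__ == "__main__":
    for name, table in (("L clusters", l_clusters), ("S clusters", s_clusters)):
        statistic, df = chi_squared_independence(table)
        p = chi2_tail_even_df(statistic, df)
        print(f"{name}: chi2({df}) = {statistic:.2f}, p = {p:.4f}")
```

Run on these counts, the sketch agrees with the reported χ²(6) = 100.1 and χ²(4) = 15.32
to rounding. The much smaller statistic for the S clusters reflects the point made above:
the majors are distributed far more evenly over C1S-C3S than over C1L-C4L.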


5. Conclusions 


Testing the first year students of Informatics when starting their tertiary education in
Hungary proved that students arrive from secondary education with underdeveloped
algorithmic skills and a low level of understanding of programming tasks; consequently,
the H1 hypothesis is not proved. Our high level of expectation was mainly based on the
high results achieved in the school leaving exams in Informatics. H1 must be rejected
because the analyses of the students’ results in the test have shown that
the school leaving exams in Informatics do not measure the students’ algorithmic skills.
These exams are able to distinguish the best students from the others, but are not able to
indicate the differences between the different levels of weakness. Consequently, we also
have to reject H2. Our results also suggest that spreadsheet problems are solved with
TAEW-based methods, and not through a deep-structural algorithmic approach.

It was found that the students’ self-evaluation in programming is quite acceptable.
However, we have proved in our previous analyses that the best students are aware of
their knowledge, while the worst evaluate their knowledge at approximately the same
level as the best (Biro et al., 2015), which is in accordance with the well-known
Dunning-Kruger effect (Kruger and Dunning, 1999). The spreadsheet self-assessment
values, in contrast, greatly exceed the students’ real knowledge. Based on the
Dunning-Kruger effect, we can conclude that in spreadsheet the students’ low level of
knowledge prevents them from forming a reliable self-assessment. The H3 hypothesis only
works with the best students in programming, and not with any of the preliminary groups,
nor with the S knowledge-based clusters.

Our analyses have proved that the students can only think in algorithms in traditional
programming environments. They do not regard the recent developments in IT as being
driven by algorithms; they solve problems in these non-traditional environments with
surface approach methods and are not able to make generalizations and abstractions.
Consequently, our H4 hypothesis must be rejected.

Considering the results of the analyses, if the universities rely heavily on the school 
leaving exams, their course structure and focus should be changed. On the other hand, 
methods must be developed and introduced in primary and secondary education which 
focus on the development of the students’ algorithmic skills in different environments 
and in different computer related activities; the development of students’ algorithmic 
skills should be independent of computer environments and computer related activities. 

Beyond our local gains and losses, the state of CSI in primary and secondary
education, its effectiveness, its efficiency, and its global methodological questions require
more data collection and analyses. The problem is not specific to Hungary; it is broader.
Beyond considering the methodology of CSI, there is a great need for international and
national standards of terminology, which should be CSI and methodology driven,
instead of being commercially based.

Testing Algorithmic Skills in Traditional and Non-Traditional
Programming Environments

Maria CSERNOCH, Piroska BIRO, Janos MATH, Kalman ABARI

The project “Testing Algorithmic and Application Skills” was launched in the 2011-2012
academic year to study the algorithmic skills of first year students of Informatics in
traditional and non-traditional programming environments, and how their knowledge of
Informatics transfers from secondary school to university. The results of the study clearly
showed that students start their Informatics studies with very underdeveloped algorithmic
skills; only a few students demonstrate better results. To find the reasons, the
problem solving methods applied by the students were analyzed. It was found that the
students mostly use only traditional programming environments as a means of developing
computational thinking and algorithmic skills. Furthermore, they do not use concept and
algorithm based methods in non-traditional computer related activities, and mostly apply
ineffective surface methods, which are used in primary and secondary schools. This may
explain the gap between the expectations of higher education and the students’ results in
the school leaving exams, as well as how the students themselves assess (overestimate)
their knowledge.



